concurrent environment
Google Brain's DRL Helps Robots 'Think While Moving' - Synced
When chasing a bouncing ball, a human will head where they anticipate the ball is going. If things change (for example, a cat swats the ball and it bounces off in a new direction), the human will correct to an appropriate new route in real time. Robots can have a hard time making such changes, as they tend to observe a state, compute an action, and execute it in sequence rather than thinking while moving. Google Brain, UC Berkeley, and X Lab have proposed a concurrent Deep Reinforcement Learning (DRL) algorithm that enables robots to take a broader, longer-term view of tasks and behaviours, and to decide on their next action before the current one is completed. The paper has been accepted by ICLR 2020.
Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
Ted Xiao, Eric Jang, Dmitry Kalashnikov, Sergey Levine, Julian Ibarz, Karol Hausman, Alexander Herzog
We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system, such as when a robot must decide on the next action while still performing the previous action. Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed. In order to develop an algorithmic framework for such concurrent control problems, we start with a continuous-time formulation of the Bellman equations, and then discretize them in a way that is aware of system delays. We instantiate this new class of approximate dynamic programming methods via a simple architectural extension to existing value-based deep reinforcement learning algorithms. We evaluate our methods on simulated benchmark tasks and a large-scale robotic grasping task where the robot must "think while moving".
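The abstract's two ingredients, a delay-aware discretization of the Bellman backup and an architectural extension to a value-based learner, can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes the extension amounts to feeding the Q-function the still-executing previous action and the action-selection delay, and that the discount is scaled by the elapsed time between observations. The names `ConcurrentTransition`, `augment`, `td_target`, `delay`, `elapsed`, and `nominal_step` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[float, ...]


@dataclass(frozen=True)
class ConcurrentTransition:
    """One transition collected while the system keeps moving (hypothetical layout)."""
    state: State
    prev_action: int   # action still executing when the policy was queried
    action: int        # action selected while prev_action was executing
    reward: float
    next_state: State
    delay: float       # time spent selecting `action` (assumed feature)
    elapsed: float     # time between observing `state` and `next_state`
    done: bool


def augment(state: State, prev_action: int, delay: float) -> State:
    """Append the concurrent-control features to the observed state."""
    return (*state, float(prev_action), delay)


def td_target(
    tr: ConcurrentTransition,
    q_fn: Callable[[State, int], float],
    n_actions: int,
    gamma: float = 0.9,
    nominal_step: float = 1.0,
) -> float:
    """Q-learning target with a time-scaled discount (an assumption of this sketch).

    The discount is gamma ** (elapsed / nominal_step), so a transition that
    spans more time is discounted more, mirroring continuous-time discounting.
    """
    if tr.done:
        return tr.reward
    # At the next decision point, the action chosen now is the one still executing.
    next_input = augment(tr.next_state, tr.action, tr.delay)
    best_next = max(q_fn(next_input, a) for a in range(n_actions))
    discount = gamma ** (tr.elapsed / nominal_step)
    return tr.reward + discount * best_next
```

With a real function approximator, `q_fn` would be the network's forward pass on the augmented input. The only changes relative to a standard Q-learning target are the extra input features and the time-dependent discount.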